Deque-Free Work-Optimal Parallel STL Algorithms
Authors
Abstract
This paper presents provably work-optimal parallelizations of STL (Standard Template Library) algorithms based on the work-stealing technique. Previous approaches typically give each processor a deque in which it stores ready tasks locally; a processor that runs out of work steals a ready task from the deque of a randomly selected processor. This paper instead presents an original implementation of work stealing that uses no deque, relying on a distributed list to bound the overhead of task creation. It contains both theoretical and experimental results bounding the work and the running time.
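To make the contrast with per-processor deques concrete, here is a minimal C++ sketch of one deque-free form of work stealing. It is not the paper's distributed-list algorithm, whose details the abstract does not give. It assumes a simpler scheme in which each worker owns a half-open index interval and an idle worker steals the back half of another worker's remaining interval, so no per-task object is ever created. The names `Interval` and `interval_stealing_for_each`, the `grain` parameter, and the round-robin (rather than random) victim choice are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: the half-open index range [begin, end) a worker still owns.
struct Interval {
    std::mutex m;
    std::size_t begin = 0;
    std::size_t end = 0;
};

// Hypothetical name. Applies f to every element of data with `workers` threads.
// f is called concurrently on distinct elements, so it must be thread-safe.
template <class T, class F>
void interval_stealing_for_each(std::vector<T>& data, F f,
                                unsigned workers = 4, std::size_t grain = 64) {
    assert(grain > 0);  // a zero grain would never make progress
    const std::size_t n = data.size();
    if (n == 0 || workers == 0) return;

    // Static initial split: worker w starts with roughly n / workers indices.
    std::vector<Interval> own(workers);
    for (unsigned w = 0; w < workers; ++w) {
        own[w].begin = n * w / workers;
        own[w].end = n * (w + 1) / workers;
    }
    std::atomic<std::size_t> remaining{n};  // elements not yet processed

    auto run = [&](unsigned self) {
        unsigned victim = self;
        while (remaining.load() > 0) {
            // Local work: take up to `grain` indices from the front of our interval.
            std::size_t lo, hi;
            {
                std::lock_guard<std::mutex> g(own[self].m);
                lo = own[self].begin;
                hi = std::min(own[self].end, lo + grain);
                own[self].begin = hi;
            }
            if (lo < hi) {
                for (std::size_t i = lo; i < hi; ++i) f(data[i]);
                remaining.fetch_sub(hi - lo);
                continue;
            }
            // Out of work: try to steal the back half of the next victim's interval.
            victim = (victim + 1) % workers;
            if (victim == self) { std::this_thread::yield(); continue; }
            std::size_t sb = 0, se = 0;
            {
                std::lock_guard<std::mutex> g(own[victim].m);
                const std::size_t left = own[victim].end - own[victim].begin;
                if (left >= 2) {
                    se = own[victim].end;
                    sb = se - left / 2;
                    own[victim].end = sb;  // the victim keeps the front half
                }
            }
            if (sb < se) {
                // Only the owner ever grows its own interval, so it is still empty here.
                std::lock_guard<std::mutex> g(own[self].m);
                own[self].begin = sb;
                own[self].end = se;
            } else {
                std::this_thread::yield();
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(run, w);
    for (auto& t : pool) t.join();
}
```

A call such as `interval_stealing_for_each(v, [](int& x) { x *= 2; });` doubles every element of `v`. A steal in this sketch only moves two indices under a lock, which gives a rough sense of why avoiding per-task deque entries can keep task-creation overhead low. The sketch makes no claim to the work-optimality bounds proved in the paper.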
Similar resources
Programming with the HPC++ Parallel Standard Template Library
We present an overview of the HPC++ Parallel Standard Template Library (PSTL), a parallel version of the C++ Standard Template Library (STL). The PSTL is part of HPC++, a C++ library and language extension framework being developed by the HPC++ consortium as a standard model for portable parallel programming in C++. The PSTL includes distributed versions of the seven STL containers (vector, lis...
Lock-Free and Practical Deques and Doubly Linked Lists using Single-Word Compare-And-Swap
We present an efficient and practical lock-free implementation of a concurrent deque that supports parallelism for disjoint accesses and uses atomic primitives which are available in modern computer systems. Previously known lock-free algorithms of deques are either based on non-available atomic synchronization primitives, only implement a subset of the functionality, or are not designed for di...
Methods of computing deque sortable permutations given complete and incomplete information
The problem of determining which permutations can be sorted using certain switchyard networks is a venerable problem in computer science dating back to Knuth in 1968. In this work, we are interested in permutations which are sortable on a double-ended queue (called a deque), or on two parallel stacks. In 1982, Rosenstiehl and Tarjan presented an O(n) algorithm for testing whether a given permu...
Install-Time System for Automatic Generation of Optimized Parallel Sorting Algorithms
Sorting is a fundamental algorithm used extensively in computer science as an intermediate step in many applications. The performance of sorting algorithms is heavily influenced by the type of data being sorted, and the machine being used. To assist in obtaining portable performance for sorting algorithms, we propose an install-time system for automatically constructing sequential and parallel ...
Lace: Non-blocking Split Deque for Work-Stealing
Work-stealing is an efficient method to implement load balancing in fine-grained task parallelism. Typically, concurrent deques are used for this purpose. A disadvantage of many concurrent deques is that they require expensive memory fences for local deque operations. In this paper, we propose a new non-blocking work-stealing deque based on the split task queue. Our design uses a dynamic split ...
Publication date: 2008